Abstract

Multivariate versions of classical orthogonal polynomials such as Jacobi, Hahn, Laguerre and Meixner are reviewed and their connection explored by adopting a probabilistic approach. Hahn and Meixner polynomials are interpreted as posterior mixtures of Jacobi and Laguerre polynomials, respectively. By using known properties of Gamma point processes and related transformations, an infinite-dimensional version of Jacobi polynomials is constructed with respect to the size-biased version of the Poisson-Dirichlet weight measure and to the law of the Gamma point process from which it is derived.

1 Introduction. 

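The Dirichlet distribution $D_\alpha$ on $d < \infty$ points, where $\alpha = (\alpha_1,\dots,\alpha_d) \in \mathbb{R}^d_+$, is the probability distribution on the $(d-1)$-dimensional simplex

$$\Delta_{(d-1)} := \{(x_1,\dots,x_{d-1}) \in [0,1]^{d-1} : |x| \le 1\},$$

described by

$$D_\alpha(dx_1,\dots,dx_{d-1}) = \frac{\Gamma(|\alpha|)}{\prod_{i=1}^{d}\Gamma(\alpha_i)}\left(\prod_{i=1}^{d-1}x_i^{\alpha_i-1}\right)(1-|x|)^{\alpha_d-1}\,dx_1\cdots dx_{d-1}, \tag{1}$$

where, for every $d \in \mathbb{N}$ and $z \in \mathbb{R}^d$, $|z| := \sum_{i=1}^{d} z_i$.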

Such a distribution plays a central role in Bayesian Statistics as well as in Population Genetics. In Statistics, it is the most widely used class of so-called prior measures, assigned to the random parameter $X = (X_1,\dots,X_{d-1},1-|X|)$ of a statistical $d$-dimensional Multinomial likelihood with probability mass function

$$B_x(n) := P(n \mid X = x) = \binom{|n|}{n}x^n, \qquad n \in \mathbb{N}^d, \tag{2}$$

where $x^n := \prod_{i=1}^{d} x_i^{n_i}$ (with $x_d := 1-|x|$) and

$$\binom{|n|}{n} := \frac{|n|!}{n_1!\cdots n_d!}.$$
In Population Genetics, $D_\alpha$ arises as the stationary distribution of the so-called $d$-types Wright-Fisher diffusion process $(X(t) : t \ge 0)$ on $\Delta_{(d-1)}$, used to model the evolutionary behavior of $d$ allele frequencies in an infinite population of genes with parent-independent, neutral mutation. The generator of the diffusion is

$$\mathcal{L} = \frac{1}{2}\sum_{i=1}^{d-1}\sum_{j=1}^{d-1}x_i(\delta_{ij}-x_j)\frac{\partial^2}{\partial x_i\,\partial x_j} + \frac{1}{2}\sum_{i=1}^{d-1}(\alpha_i-|\alpha|x_i)\frac{\partial}{\partial x_i}, \tag{3}$$

where $\delta_{xy}$ is the Kronecker delta, equal to 1 if $x = y$ and to 0 otherwise. Both models in Bayesian Statistics and Population Genetics are equivalently described in terms of what one expects to observe from a collection of $|n|$ individuals (genes) ($|n| = 1,2,\dots$) sampled from the entire population (at a given time). When the individuals are exchangeably sampled from Dirichlet random proportions $X = (X_1,\dots,X_{d-1},1-|X|)$, the probability of finding exactly $n_1,\dots,n_d$ individuals (genes) of type $1,\dots,d$, respectively, is given by a Dirichlet mixture of Multinomial distributions, defined, for every $n \in \mathbb{N}^d$, by

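$$DM_\alpha(n) = \int_{\Delta_{(d-1)}} B_x(n)\,D_\alpha(dx) = \binom{|n|}{n}\frac{\prod_{i=1}^{d}(\alpha_i)_{(n_i)}}{|\alpha|_{(|n|)}}, \tag{4}$$

where

$$a_{(z)} := \frac{\Gamma(a+z)}{\Gamma(a)}.$$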
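As a quick numerical illustration of formula (4), the following sketch evaluates the Dirichlet-Multinomial mass function through log-Gamma functions and sums it over all compositions of a fixed sample size. The function names and the parameter values are illustrative assumptions, not taken from the paper.

```python
from itertools import product
from math import exp, lgamma


def log_rising(a, z):
    """Logarithm of the ascending factorial a_(z) = Gamma(a + z) / Gamma(a)."""
    return lgamma(a + z) - lgamma(a)


def dirichlet_multinomial_pmf(n, alpha):
    """DM_alpha(n): multinomial coefficient times prod (alpha_i)_(n_i) / |alpha|_(|n|)."""
    total = sum(n)
    log_p = lgamma(total + 1) - sum(lgamma(k + 1) for k in n)
    log_p += sum(log_rising(a, k) for a, k in zip(alpha, n))
    log_p -= log_rising(sum(alpha), total)
    return exp(log_p)


# Hypothetical parameter vector and sample size, chosen only for the check.
alpha = (0.5, 1.0, 2.5)
size = 6
compositions = [n for n in product(range(size + 1), repeat=3) if sum(n) == size]
total_mass = sum(dirichlet_multinomial_pmf(n, alpha) for n in compositions)
```

With a single observation the formula reduces to $DM_\alpha(e_i) = \alpha_i/|\alpha|$, which is the prior mean of $X_i$.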

There are several infinite-dimensional versions of the Dirichlet distribution, obtained as $d \to \infty$ and $|\alpha| \to |\theta| > 0$, which will be described in Section 2, with substantially similar applications in Statistics and Population Genetics. Under a hypothesis of multinomial sampling, they all induce discrete measures which are a modification of $DM_\alpha$.
In this paper we will review multivariate orthogonal polynomials, complete with respect to weight measures given by $D_\alpha$ or $DM_\alpha$, that is, polynomials $\{G_n : n \in \mathbb{N}^d\}$ satisfying

$$\int G_n G_m\,d\mu = \frac{1}{c_m}\,\delta_{nm}, \qquad n,m \in \mathbb{N}^d. \tag{5}$$

We will call $\{G_n\}$ multivariate Jacobi polynomials if (5) is satisfied with $\mu = D_\alpha$, and multivariate Hahn polynomials if $\mu = DM_\alpha$. Here $c_m$ are positive constants. Completeness means that, for every function $f$ with finite variance (under $\mu$), there is an expansion

$$f(x) = \sum_{n} c_n a_n G_n(x), \tag{6}$$

where

$$a_n = E[f(X)G_n(X)].$$



Systems of multivariate orthogonal polynomials are not unique, and a large number of characterizations of $d$-dimensional Jacobi and Hahn polynomials exist in the literature. We will focus on a construction of Jacobi polynomials, based on a method originally proposed by Koornwinder [16], which has a strong probabilistic interpretation, by means of which we will be able to: (1) describe multivariate Hahn polynomials as posterior mixtures of Jacobi polynomials, in a sense which will become precise in Section 5; (2) construct, in Section 4, a system of multiple Laguerre polynomials, orthogonal with respect to the product probability measure

$$\gamma_{\alpha,|\beta|}(dy) := \frac{\prod_{i=1}^{d}y_i^{\alpha_i-1}\,e^{-|y|/|\beta|}}{\Gamma(\alpha)\,|\beta|^{|\alpha|}}\,dy, \qquad y \in \mathbb{R}^d_+, \tag{7}$$

with

$$\Gamma(\alpha) := \prod_{i=1}^{d}\Gamma(\alpha_i);$$

(3) derive, in Section 6, a new class of multiple Meixner polynomials as posterior mixtures of the Laguerre polynomials mentioned in (2); (4) obtain polynomials in the multivariate Hypergeometric distribution by taking the parameters in the Hahn polynomials to be negative; (5) obtain (Section 3.4) asymptotic results as the dimension $d$ is let to go to infinity with $|\alpha| \to |\theta| > 0$.

The intricate relationship connecting all the mentioned systems of polynomials is entirely explained by the relationship existing among the respective weight measures, which becomes more transparent under a probabilistic approach; with this in mind we will begin the paper with an introductory summary (Section 2) of known facts from the theory of probability distributions. Section 3 is devoted to multivariate Jacobi polynomials, whose structure will be the building block for the subsequent sections: multiple Laguerre in Section 4, Hahn in Section 5, Meixner in Section 6.

It is worth observing that the posterior-mixture representation of multivariate Hahn polynomials shown in 
proposition [8] is obtained without imposing a priori any Bernstein-Bezier form to the Jacobi polynomials, 
and nevertheless it agrees with recent interpretations of Hahn polynomials as Bernstein coefficients of Jacobi 
polynomials in such a form ([SSlEl]), a result for which a new, more probabilistic proof is offered in Section 
5.2.1. Along the same lines one can view the Meixner polynomials obtained in Section 6 as re-scaled Bernstein coefficients of our multiple Laguerre polynomials, as shown in Section 6.1.

On the other hand, our construction of Hahn polynomials is in terms of mixtures over polynomials in independent random variables, and as such, our derivation is closely related to the original formulation of
multivariate Hahn polynomials, as weighted products of univariate Hahn polynomials, proposed decades ago 
by Karlin and Mac Gregor [T^]. 

The original motivation for this study was to obtain some background material which can be used to characterize bivariate distributions, or transition functions, with fixed Dirichlet or Dirichlet-Multinomial marginals, for which the following canonical expansions are possible:

$$p(dx,dy) = \Big\{1 + \sum_{|n| \ge 1} c_n\rho_n G_n(x)G_n(y)\Big\}\,D_\alpha(dx)\,D_\alpha(dy), \qquad x,y \in \Delta_{(d-1)},$$

for appropriate, positive-definite sequences $\{\rho_m : m \in \mathbb{N}^d\}$, called the canonical correlation coefficients of the model. Results on such a particular problem will be published in a subsequent paper. Other possible applications in statistics are related to least square approximations and regression. An MCMC-Gibbs sampler use of orthogonal polynomials is explored, for example, in [20]. In this paper however we will focus merely on the construction of the mentioned systems of polynomials.



2 Distributions on the discrete and continuous simplex. 

2.1 Conditional independence in the Dirichlet distribution. 
2.1.1 Gamma sums. 

Denote by $\gamma_{|\alpha|,|\beta|}(dz)$ the Gamma probability density function (pdf) with parameter $(|\alpha|,|\beta|) \in \mathbb{R}^2_+$.

For every $\alpha \in \mathbb{R}^d_+$ and $|\beta| > 0$, let $Y = (Y_1,\dots,Y_d)$ be a collection of $d$ independent Gamma random variables with parameter, respectively, $(\alpha_i,|\beta|)$. Their distribution is given by the product measure $\gamma_{\alpha,|\beta|}$ defined by (7). Consider the mapping

$$(Y_1,\dots,Y_d) \mapsto (|Y|,X_1,\dots,X_{d-1}),$$

where

$$X_j := \frac{Y_j}{|Y|}, \qquad j = 1,\dots,d-1.$$

It is easy to rewrite

$$\gamma_{\alpha,|\beta|}(dy) = \gamma_{|\alpha|,|\beta|}(d|y|)\,D_\alpha(dx),$$

that is: (i) $|Y| := \sum_{i=1}^{d}Y_i$ is a Gamma$(|\alpha|,|\beta|)$ random variable, and (ii) $X$ is independent of $|Y|$ and has Dirichlet distribution with parameter $\alpha$.
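The Gamma construction can be checked by simulation. The sketch below is only an illustration under assumed parameter values: it draws independent Gamma variables with a common scale (Python's `random.gammavariate` takes a shape and a scale), normalizes them, and compares a few sample statistics with the values implied by (i) and (ii). The seed and the sample size are arbitrary choices.

```python
import random


def gamma_to_dirichlet(alpha, scale, rng):
    """Draw independent Gamma(alpha_i, scale) variables; return (|Y|, Y / |Y|)."""
    ys = [rng.gammavariate(a, scale) for a in alpha]
    total = sum(ys)
    return total, [y / total for y in ys]


rng = random.Random(12345)  # fixed seed, chosen arbitrarily
alpha, scale, draws = (1.0, 2.0, 3.0), 2.0, 20000
samples = [gamma_to_dirichlet(alpha, scale, rng) for _ in range(draws)]

# |Y| should have mean |alpha| * scale, and X_1 should have mean alpha_1 / |alpha|.
mean_total = sum(t for t, _ in samples) / draws
mean_x1 = sum(x[0] for _, x in samples) / draws
# Independence of |Y| and X implies a sample covariance close to zero.
cov_total_x1 = sum((t - mean_total) * (x[0] - mean_x1) for t, x in samples) / draws
```

A near-zero covariance is of course only a necessary consequence of independence, not a proof of it.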



2.1.2 Dirichlet as Right-Neutral Distribution. 

Let $X = (X_1,\dots,X_d)$ be a random distribution on $\{1,\dots,d\}$ with Dirichlet distribution $D_\alpha$, $\alpha \in \mathbb{R}^d_+$. Consider the random cumulative frequencies $S_j := \sum_{i=1}^{j}X_i$, $j = 1,\dots,d-1$ (with $S_0 := 0$). Then the increments

$$B_j := \frac{X_j}{1-S_{j-1}}, \qquad j = 1,\dots,d-1, \tag{8}$$

are independent random variables. This property is known as right-neutrality. In particular, each $B_j$ has a Beta distribution with parameters $(\alpha_j, |\alpha| - \sum_{i=1}^{j}\alpha_i)$.

To see this, rewrite $D_\alpha$ in terms of the increments $B_j$ as defined by (8), for $j = 1,\dots,d-1$. The change of measure induces:

$$D_\alpha(dx) = \prod_{j=1}^{d-1}\frac{\Gamma(\alpha_j+A_j)}{\Gamma(\alpha_j)\Gamma(A_j)}\,b_j^{\alpha_j-1}(1-b_j)^{A_j-1}\,db_j, \qquad A_j := |\alpha| - \sum_{i=1}^{j}\alpha_i, \tag{9}$$

which is the distribution of $d-1$ independent Beta random variables. Notice that such a structure holds, with different parameters, for any reordering of the atoms of $X$.
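Right-neutrality can be verified exactly on moments. Since $X_j = B_j\prod_{i<j}(1-B_i)$ with independent Beta increments, the moments of $X_j$ factorize, and they must coincide with those of the Beta$(\alpha_j,|\alpha|-\alpha_j)$ marginal of the Dirichlet distribution. The sketch below checks this in exact rational arithmetic; the integer parameter vector is a hypothetical example and the function names are not from the paper.

```python
from fractions import Fraction


def beta_moment(a, b, k):
    """E[B^k] for B ~ Beta(a, b), equal to a_(k) / (a + b)_(k)."""
    out = Fraction(1)
    for i in range(k):
        out *= Fraction(a + i, a + b + i)
    return out


def moment_from_increments(alpha, j, k):
    """E[X_j^k] (j is 0-based) computed from the independent Beta increments."""
    theta = sum(alpha)
    used = 0
    out = Fraction(1)
    for i in range(j):
        used += alpha[i]
        # 1 - B_i has a Beta(|alpha| - alpha_1 - ... - alpha_i, alpha_i) law.
        out *= beta_moment(theta - used, alpha[i], k)
    # B_j has a Beta(alpha_j, |alpha| - alpha_1 - ... - alpha_j) law;
    # for the last coordinate the second parameter is 0 and the factor equals 1.
    return out * beta_moment(alpha[j], theta - used - alpha[j], k)


alpha = (2, 3, 1, 4)  # hypothetical integer parameters
agreement = all(
    moment_from_increments(alpha, j, k) == beta_moment(alpha[j], sum(alpha) - alpha[j], k)
    for j in range(len(alpha))
    for k in range(1, 6)
)
```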



2.2 Unordered Dirichlet frequencies and limit distributions. 

In many applications the locations of the atoms of a Dirichlet population have no intrinsic, material meaning, 
and it is preferable to look at the distribution of these frequencies in an order-independent framework. Two 
possible ways of unordering the Dirichlet atoms are: (1) rearranging the frequencies in a size-biased random 
order; (2) ranking them in order of magnitude. The main usefulness of considering size-biased or ranked 
distributions, is that they admit sensible limits as the dimension d grows to infinity, whereas the original 
Dirichlet distribution is obviously bounded to finite dimensions. The two resulting distributions (known, 
respectively, as the GEM and the Poisson-Dirichlet distribution) are in a one-to-one relation with each 
other. 

2.2.1 Size-biased order and the GEM distribution. 

Let $x$ be a point of $\Delta_{(d-1)}$, with $|x| = 1$. Then $x$ induces a probability distribution on the group $\mathcal{G}_d$ of all permutations of $\{1,\dots,d\}$:

$$\sigma_x(\pi) = \prod_{i=1}^{d-1}\frac{x_{\pi(i)}}{1-\sum_{j=1}^{i-1}x_{\pi(j)}}.$$

Let $\alpha \in \mathbb{R}^d_+$. The size-biased measure on $\Delta_{(d-1)}$ induced by a Dirichlet distribution $D_\alpha$ is given by

$$\ddot{D}_\alpha(A) = \int \sigma_x(\pi : \pi x \in A)\,D_\alpha(dx).$$

Note that $\sigma_x\{y\} := \sigma_x(\pi : \pi x = y)$ is nonzero if and only if $y$ is a permutation of $x$, and that

$$\sigma_x\{y\} = \sigma_{\pi x}\{y\} = \sigma\{y\} \qquad \forall \pi \in \mathcal{G}_d,$$

hence the density of the size-biased measure is 

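In particular, if $\alpha = (|\theta|/d,\dots,|\theta|/d)$ for some $|\theta| > 0$ (symmetric Dirichlet), then its size-biased measure is

$$\ddot{D}_{|\theta|,d}(dx) = d!\prod_{i=1}^{d-1}\frac{x_i}{1-s_{i-1}}\,D_\alpha(dx) \tag{10}$$

$$\propto \prod_{i=1}^{d-1}B_i^{|\theta|/d}(1-B_i)^{\frac{d-i}{d}|\theta|-1}\,dB_i, \tag{11}$$

where $s_i := \sum_{j=1}^{i}x_j$ and $B_i$ is defined as in (8). This is the distribution of $d-1$ independent Beta random variables with parameters, respectively, $(|\theta|/d+1, \frac{d-i}{d}|\theta|)$, $i = 1,\dots,d-1$.
The measure $\ddot{D}_{|\theta|,d}$ is, again, a right-neutral measure.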

Now, let $d \to \infty$. Then $\ddot{D}_{|\theta|,d}$ converges weakly to the law of a right-neutral sequence $X^\infty = (X_1,X_2,\dots)$ such that

$$X_j \overset{D}{=} B_j\prod_{i=1}^{j-1}(1-B_i), \qquad j \ge 1, \tag{12}$$

for a sequence $B = (B_1,B_2,\dots)$ of independent and identically distributed (iid) Beta weights with parameter $(1,|\theta|)$ (here and in the following pages $\overset{D}{=}$ means "in distribution").

Definition 1. The random sequence $X^\infty$ satisfying (12) for a sequence of Beta$(1,|\theta|)$ weights is called the GEM distribution with parameter $|\theta|$ (GEM($|\theta|$)).
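Representation (12) translates directly into a stick-breaking sampler. The following sketch, with an arbitrary seed, truncation length and parameter value (all assumptions made for the illustration), generates the first coordinates of a GEM sequence and records the sample mean of the first coordinate, whose theoretical value is $E[B_1] = 1/(1+|\theta|)$.

```python
import random


def gem_sample(theta, length, rng):
    """First `length` coordinates of a GEM(theta) sequence via stick breaking."""
    remaining = 1.0  # mass not yet assigned: prod_{i<j} (1 - B_i)
    xs = []
    for _ in range(length):
        b = rng.betavariate(1.0, theta)  # iid Beta(1, theta) weights
        xs.append(b * remaining)
        remaining *= 1.0 - b
    return xs


rng = random.Random(2024)  # fixed seed, chosen arbitrarily
theta, draws = 2.0, 10000
samples = [gem_sample(theta, 20, rng) for _ in range(draws)]
mean_first = sum(x[0] for x in samples) / draws
```

Each truncated sequence sums to less than one; the missing mass is the product of the remaining factors $(1-B_i)$.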



2.2.2 Ranked Dirichlet frequencies and the Poisson-Dirichlet distribution. 

Let $Y = (Y_1,\dots,Y_d)$ be a vector with distribution $\gamma_{\alpha,|\beta|}$. Consider the function $\rho : \mathbb{R}^d \to \mathbb{R}^d$ which rearranges the coordinates of $Y \in \mathbb{R}^d$ in decreasing order. Then $Y^\downarrow := \rho(Y)$ is known as the order statistics of $Y$. If all coordinates are iid with common parameter $|\theta|/d$, then the law of $Y^\downarrow$ is given by

$$d!\,\gamma_{\alpha,|\beta|}(dy)\,\mathbb{I}(y_1 \ge y_2 \ge \dots \ge y_d).$$

Since $|Y|$ is stochastically independent of $Y/|Y|$, it is also independent of $f(Y/|Y|)$ for any function $f$, hence $|Y|$ is independent of $X^\downarrow := Y^\downarrow/|Y|$. Denote the distribution of $X^\downarrow$ by $D^\downarrow_{|\theta|,d}$. For a symmetric Dirichlet distribution (i.e. with $\alpha = (|\theta|/d,\dots,|\theta|/d)$, $|\theta| > 0$) it is easy to verify that size-biased and ranked Dirichlet frequencies co-determine each other via the relation:

$$(\ddot{X})^\downarrow \overset{D}{=} X^\downarrow, \tag{13}$$

$$\ddot{(X^\downarrow)} \overset{D}{=} \ddot{X}, \tag{14}$$

for any $d = 2,3,\dots$, where the double dot denotes the size-biased permutation.



Poisson point process construction ([15]).

Let $Y^\infty = (Y_1,Y_2,\dots)$ be the sequence of points of a non-homogeneous point process with intensity measure

$$N_{|\theta|}(dy) = |\theta|\,y^{-1}e^{-y}\,dy.$$

The probability generating functional is

$$\mathcal{F}_{|\theta|}(\xi) = E_{|\theta|}\Big(\exp\Big\{\int \log\xi(y)\,N_{|\theta|}(dy)\Big\}\Big) = \exp\Big\{|\theta|\int_0^\infty(\xi(y)-1)\,y^{-1}e^{-y}\,dy\Big\}, \tag{15}$$

for suitable functions $\xi : \mathbb{R} \to [0,1]$. Then $|Y^\infty|$ is a Gamma($|\theta|$) random variable and is independent of the sequence of ranked, normalized points

$$X^{\downarrow\infty} := \frac{Y^{\downarrow\infty}}{|Y^\infty|}.$$

Definition 2. The distribution of $X^{\downarrow\infty}$ is called the Poisson-Dirichlet distribution with parameter $|\theta| > 0$.

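Remark 1. Obviously the GEM($|\theta|$) distribution can be redefined in a similar fashion: consider the same point process $Y^\infty$ and reorder its jumps by their size-biased random order, i.e. set $\ddot{Y}_1 = Y_{i_1}$ with (random) probability $Y_{i_1}/|Y^\infty|$ and

$$P\big(\ddot{Y}_{k+1} = Y_{i_{k+1}} \mid \ddot{Y}_1,\dots,\ddot{Y}_k\big) = \frac{Y_{i_{k+1}}}{|Y^\infty| - \sum_{j=1}^{k}\ddot{Y}_j}, \qquad k = 1,2,\dots$$

Denote the vector of all the size-biased jumps by $\ddot{Y}^\infty$. Then $|\ddot{Y}^\infty| = |Y^\infty|$ is independent of the normalized sequence

$$\ddot{X}^\infty := \frac{\ddot{Y}^\infty}{|\ddot{Y}^\infty|},$$

and $\ddot{X}^\infty$ has the GEM($|\theta|$) distribution.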



Finite-dimensional distributions. 

An important role in determining the finite-dimensional distribution of the Poisson-Dirichlet process is given 
by its frequency spectrum or factorial moment measure. 

Proposition 1. (Watterson [23]) Let $\xi$ be the random measure corresponding to a Poisson-Dirichlet point process. Then, for every $|r| \in \mathbb{N}$ and distinct $x_1,\dots,x_{|r|}$ with $|x| < 1$,

$$P(\xi(dx_1) > 0,\dots,\xi(dx_{|r|}) > 0) = f^{[r]}_{|\theta|}(x_1,\dots,x_{|r|})\,dx_1\cdots dx_{|r|},$$

where

$$f^{[r]}_{|\theta|}(x_1,\dots,x_{|r|}) = |\theta|^{|r|}\Big(\prod_{i=1}^{|r|}x_i^{-1}\Big)(1-|x|)^{|\theta|-1}, \qquad 0 < x_1,\dots,x_{|r|} < 1. \tag{16}$$

Let F|^j[''^(da;) be the measure with density filV as in p6p . Then 

^Sr'^^'^^) = i_H^'^[l'-|]^(|e|M...,|e|Mi^^|e[)(^^)' a^eA,,,, (17) 

where 

r(a + i) 

" - a,x eR, a + 1 > X. 



r(a + 1 - x) 

ional I 

the GEM{\0\) distribution: 



Note that the finite-dimensional distributions of the size-biased permutation of F^g^^'' coincide with those of 



|r| 



n , Fm(dx) = GEM\o\idx). 



A 1 - ^ 




The relationship between PD and GEM is more understandable if we notice that the probability generating 
functional of -fi^^, for a = {\e\/d, \0\/d), is ([10]) 

- (18) 

a — ^oo 

which, by continuity of the ordering function $\rho$, implies that if $X^{\downarrow d}$ has distribution $D^\downarrow_{|\theta|,d}$, then $X^{\downarrow d}$ converges in distribution to $X^{\downarrow\infty}$ as $d \to \infty$.

Moreover, continuity of $\rho$ and the fact that $|X| = 1$ almost surely ensure that the relations (13)-(14) hold in the limit, that is: if $\ddot{X}^\infty$ has a GEM($|\theta|$) distribution, then

$$(\ddot{X}^\infty)^\downarrow \overset{D}{=} X^{\downarrow\infty}; \tag{19}$$

$$\ddot{(X^{\downarrow\infty})} \overset{D}{=} \ddot{X}^\infty. \tag{20}$$

Such a duality (of which several proofs are available, see [19] for an account and references) leads to a double series representation for the most popular class of nonparametric prior measures on the space $\mathcal{P}_E$ of probability measures on any diagonal-measurable space $E$, the so-called Ferguson-Dirichlet class [8].

2.2.3 The Ferguson-Dirichlet class of random probability measures. 

Definition 3. Let $\alpha$ be a diffuse measure on some Polish space $E$. A random probability measure $F$ on $E$ belongs to the Ferguson-Dirichlet class with parameter $\alpha$ (FD($\alpha$)) if its distribution is such that, for every integer $d$ and any Borel-measurable partition $A = (A_1,\dots,A_d)$ of $E$, the distribution of the vector $(F(A_1),\dots,F(A_d))$ is $D_{\alpha_A}$, where $\alpha_A := (\alpha(A_1),\dots,\alpha(A_d))$.

Theorem 1. Let $\alpha$ be a diffuse measure on some diagonal-measurable space $E$ and denote $\alpha(E) = |\theta|$, $\nu = \alpha/|\theta|$. A random probability distribution $F$ on $E$ is FD($\alpha$) if

$$F(\cdot) = \sum_{i=1}^{\infty}X_i\,\delta_{Z_i}(\cdot) \tag{21}$$

almost surely, where:

(i) $X = (X_1,X_2,\dots)$ is independent of $Z = (Z_1,Z_2,\dots)$;

(ii) $Z$ is a collection of iid random variables with common law $\nu$;

(iii) $X$ has either a PD($|\theta|$), or a GEM($|\theta|$) distribution.
2.3 Sampling formulae 
2.3.1 Negative Binomial sums. 

We have already seen in the introduction that the Dirichlet-Multinomial distribution arises as a Dirichlet mixture of Multinomial likelihoods. Another construction is possible, based on Negative-Binomial random sequences, which parallels the Gamma construction of the Dirichlet measure of Section 2.1.1.
Let $NB_{\alpha,p}$, $\alpha > 0$, $p \in (0,1)$, denote the Negative Binomial distribution with probability mass function:

$$NB_{\alpha,p}(k) = \frac{\alpha_{(k)}}{k!}\,p^k(1-p)^\alpha, \qquad k = 0,1,\dots \tag{22}$$

When $\alpha \in \mathbb{N}$, such a measure describes the distribution of the number of successes occurring in a sequence of iid Bernoulli experiments (with success probability $p$) before the $\alpha$-th failure. Two features of $NB_{\alpha,p}$ will prove useful, in Section 6, to connect multiple Meixner polynomials to multivariate Hahn polynomials.

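The first feature is that Negative-Binomial distributions arise as Gamma mixtures of Poisson likelihoods:

$$NB_{\alpha,p}(k) = \int_0^\infty Po_\lambda(k)\,\gamma_{\alpha,\frac{p}{1-p}}(d\lambda), \tag{23}$$

where

$$Po_\lambda(k) = \frac{e^{-\lambda}\lambda^k}{k!}, \qquad k = 0,1,2,\dots$$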
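The mixture representation can be checked numerically. The sketch below assumes the Negative Binomial mass function $\frac{\alpha_{(k)}}{k!}p^k(1-p)^\alpha$ and a Gamma mixing law with shape $\alpha$ and scale $p/(1-p)$, and approximates the integral with a plain midpoint rule; the truncation point, the number of steps and the parameter values are arbitrary choices made for the illustration.

```python
from math import exp, lgamma, log


def nb_pmf(k, a, p):
    """Negative Binomial mass a_(k) / k! * p^k * (1 - p)^a."""
    return exp(lgamma(a + k) - lgamma(a) - lgamma(k + 1) + k * log(p) + a * log(1 - p))


def poisson_gamma_mixture(k, a, p, upper=40.0, steps=80000):
    """Midpoint-rule approximation of int Po_lambda(k) gamma_{a, p/(1-p)}(d lambda)."""
    scale = p / (1 - p)
    h = upper / steps
    log_const = -lgamma(k + 1) - lgamma(a) - a * log(scale)
    total = 0.0
    for i in range(steps):
        lam = (i + 0.5) * h
        # Poisson mass times Gamma density, both evaluated on the log scale.
        total += exp(log_const + (k + a - 1) * log(lam) - lam * (1 + 1 / scale))
    return total * h


a, p = 2.5, 0.4  # hypothetical parameter values
pairs = [(nb_pmf(k, a, p), poisson_gamma_mixture(k, a, p)) for k in range(6)]
```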



We recapitulate the second feature in the next Lemma.

Lemma 1. Consider any $\alpha \in \mathbb{R}^d_+$ and $p \in (0,1)$. Let $R_1,\dots,R_d$ be independent Negative Binomial random variables with parameter $(\alpha_i,p)$, respectively for $i = 1,\dots,d$. Then

(i) $|R| := \sum_{i=1}^{d}R_i$ is a Negative Binomial random variable with parameter $(|\alpha|,p)$;

(ii) Conditional on $|R| = |r|$, the vector $R = (R_1,\dots,R_d)$ has a Dirichlet-Multinomial distribution with parameter $(\alpha,|r|)$.

For $\alpha = (|\alpha|/d,\dots,|\alpha|/d)$ it is now obvious that $\rho(R)/|R|$ is independent of $|R|$.
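Part (ii) of Lemma 1 amounts to an identity between mass functions: the product of the Negative Binomial masses divided by the mass of the total equals the Dirichlet-Multinomial mass. A minimal numerical check, assuming the mass functions written as $\frac{\alpha_{(k)}}{k!}p^k(1-p)^\alpha$ and as in (4), with hypothetical parameter values:

```python
from itertools import product
from math import exp, isclose, lgamma, log


def nb_pmf(k, a, p):
    """Negative Binomial mass a_(k) / k! * p^k * (1 - p)^a."""
    return exp(lgamma(a + k) - lgamma(a) - lgamma(k + 1) + k * log(p) + a * log(1 - p))


def dm_pmf(r, alpha):
    """Dirichlet-Multinomial mass of the count vector r."""
    total = sum(r)
    log_p = lgamma(total + 1) - sum(lgamma(k + 1) for k in r)
    log_p += sum(lgamma(a + k) - lgamma(a) for a, k in zip(alpha, r))
    log_p -= lgamma(sum(alpha) + total) - lgamma(sum(alpha))
    return exp(log_p)


def conditional_given_total(r, alpha, p):
    """P(R = r | |R| = |r|) for independent Negative Binomial coordinates."""
    joint = 1.0
    for r_i, a_i in zip(r, alpha):
        joint *= nb_pmf(r_i, a_i, p)
    return joint / nb_pmf(sum(r), sum(alpha), p)


alpha, p = (0.7, 1.3, 2.0), 0.35  # hypothetical parameters
vectors = [r for r in product(range(5), repeat=3) if sum(r) == 4]
lemma_holds = all(
    isclose(conditional_given_total(r, alpha, p), dm_pmf(r, alpha), rel_tol=1e-9)
    for r in vectors
)
```

The value of $p$ cancels in the ratio, which is why the conditional law does not depend on it.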
2.3.2 Partial right-neutrality.

For every $r \in \mathbb{N}^d$ and $\alpha \in \mathbb{R}^d_+$, denote as usual $R_j = \sum_{i=j+1}^{d}r_i$ and $A_j = \sum_{i=j+1}^{d}\alpha_i$ (so that $R_0 = |r|$). It is easy to see that

$$DM_\alpha(r;|r|) = \int B_x(r)\,D_\alpha(dx) = \prod_{j=1}^{d-1}DM_{\alpha_j,A_j}(r_j;R_{j-1}). \tag{24}$$

In other words: for every $j = 1,\dots,d-1$, $r_j/R_{j-1}$ is conditionally independent of $r_1,\dots,r_{j-1}$, given $R_{j-1}$. Such a property can be interpreted as a partial right-neutrality property, and we have just seen that it is a direct consequence of the right-neutrality property of the Dirichlet distribution. This feature is responsible for our construction of multivariate Hahn polynomials.
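The factorization of the Dirichlet-Multinomial law into two-type (Beta-Binomial) factors can be confirmed in exact arithmetic. In the sketch below each factor is the two-type mass of $(r_j, R_j)$ with parameters $(\alpha_j, A_j)$, where $R_j$ and $A_j$ denote the tail sums over indices larger than $j$; this reading of the notation, and the parameter values, are assumptions of the example.

```python
from fractions import Fraction
from itertools import product
from math import factorial


def rising(a, k):
    """Ascending factorial a_(k) = a (a + 1) ... (a + k - 1)."""
    out = Fraction(1)
    for i in range(k):
        out *= a + i
    return out


def dm(n, alpha):
    """Dirichlet-Multinomial mass of the count vector n, as an exact fraction."""
    value = Fraction(factorial(sum(n)))
    for k in n:
        value /= factorial(k)
    for a, k in zip(alpha, n):
        value *= rising(a, k)
    return value / rising(sum(alpha), sum(n))


def dm_factorised(r, alpha):
    """Product over j of the two-type masses of (r_j, R_j) with parameters (alpha_j, A_j)."""
    value = Fraction(1)
    for j in range(len(r) - 1):
        tail_r = sum(r[j + 1:])       # R_j
        tail_a = sum(alpha[j + 1:])   # A_j
        value *= dm((r[j], tail_r), (alpha[j], tail_a))
    return value


alpha = (Fraction(1, 2), Fraction(2), Fraction(1), Fraction(3, 2))  # hypothetical
vectors = [r for r in product(range(6), repeat=4) if sum(r) == 5]
factorisation_holds = all(dm(r, alpha) == dm_factorised(r, alpha) for r in vectors)
```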



2.3.3 Hypergeometric distribution. 

Consider the form of the probability mass function $DM_\alpha$ but now replace the parameter $\alpha$ with $-\epsilon = (-\epsilon_1,\dots,-\epsilon_d)$, with $0 \le n_j \le \epsilon_j$, $j = 1,\dots,d$. Then

$$DM_{-\epsilon}(n) = \frac{|n|!}{n_1!\cdots n_d!}\,\frac{\prod_{i=1}^{d}(-\epsilon_i)_{(n_i)}}{(-|\epsilon|)_{(|n|)}} = \frac{\prod_{i=1}^{d}\binom{\epsilon_i}{n_i}}{\binom{|\epsilon|}{|n|}} =: H_\epsilon(n). \tag{25}$$

$H_\epsilon(n)$ is known as the multivariate Hypergeometric distribution with parameter $\epsilon$.

The partial right-neutrality property of the Dirichlet-Multinomial distribution is preserved for the Hypergeometric law; however, the interpretation as a Dirichlet mixture of iid laws is lost, as the Dirichlet (as well as the Gamma and the Beta) integral is not defined for negative parameters.
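The passage from negative Dirichlet-Multinomial parameters to the Hypergeometric law rests on the identity $(-\epsilon)_{(n)} = (-1)^n\,\epsilon!/(\epsilon-n)!$, whose signs cancel between numerator and denominator. The sketch below verifies the resulting equality exactly for a small urn; the urn composition and the sample size are hypothetical.

```python
from fractions import Fraction
from itertools import product
from math import comb, factorial


def rising(a, k):
    """Ascending factorial a_(k), also valid for negative integers a."""
    out = Fraction(1)
    for i in range(k):
        out *= a + i
    return out


def dm_negative(n, e):
    """Dirichlet-Multinomial formula evaluated at the negative parameter -e."""
    value = Fraction(factorial(sum(n)))
    for k in n:
        value /= factorial(k)
    for e_i, k in zip(e, n):
        value *= rising(-e_i, k)
    return value / rising(-sum(e), sum(n))


def hypergeometric(n, e):
    """Multivariate Hypergeometric mass: prod C(e_i, n_i) / C(|e|, |n|)."""
    numerator = 1
    for e_i, k in zip(e, n):
        numerator *= comb(e_i, k)
    return Fraction(numerator, comb(sum(e), sum(n)))


e = (3, 2, 4)  # hypothetical urn composition
draws = 4
vectors = [n for n in product(*(range(e_i + 1) for e_i in e)) if sum(n) == draws]
identity_holds = all(dm_negative(n, e) == hypergeometric(n, e) for n in vectors)
```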



Limit sample distributions. 

2.3.4 Ranked sample frequencies and the Ewens' sampling formula. 

Let $R$ be an integer-valued, $d$-dimensional vector with Dirichlet-Multinomial $DM_\alpha$ distribution. As $|R| \to \infty$, $R/|R|$ will converge to a random point in the simplex with distribution $D_\alpha$. We have already seen that limit versions of $D_\alpha$ as $d \to \infty$ exist only in their unlabeled or size-biased versions. The sampling formula converging to the former version is known as the Ewens' sampling formula.
For any $|n|, d \in \mathbb{N}$, let $R \in \mathbb{N}^d$ have distribution $DM_\alpha$, $\alpha \in \mathbb{R}^d_+$. The vector of order statistics

$$R^\downarrow = \rho(R)$$

has distribution

$$DM^\downarrow_\alpha(r^\downarrow;|n|) = \sum_{\sigma}DM_\alpha(\sigma r^\downarrow;|n|), \qquad r^\downarrow \in \rho(\mathbb{N}^d),$$

where the sum is over the distinct rearrangements of $r^\downarrow$. For symmetric measures, with $\alpha = (|\alpha|/d,\dots,|\alpha|/d)$, $|\alpha| > 0$, this is equal to

$$DM^\downarrow_{|\alpha|,d}(r^\downarrow;|n|) = \binom{|n|}{r}\frac{1}{\prod_{i=1}^{|n|}b_i!}\,m_{|\alpha|,d}(r;|n|), \tag{26}$$

where, for $i = 1,\dots,|n|$, $b_i$ denotes the number of coordinates in $r$ equal to $i$; here $k = k(r) := \sum_i b_i$ is the number of strictly positive coordinates (hence $|n| = \sum_{i=1}^{|n|}i\,b_i$), and

$$m_{|\alpha|,d}(r;|n|) := \frac{d!}{(d-k)!\,d^k}\,\frac{|\alpha|^k}{|\alpha|_{(|n|)}}\prod_{j=1}^{k}\Big(\frac{|\alpha|}{d}+1\Big)_{(r_j-1)}. \tag{27}$$

It is possible to interpret the symmetric function $m_{|\alpha|,d}(r;|n|)$ in three ways:

m|„,,d(r;|n|) = E(X'^) (28) 

(fc i-i \ 

X{^-'---Xl>'~'X{{l'Y.^A (29) 
»=1 3=1 I 

= e(x^:;...X^;^) , {zi<...<z,}C {!,..., d} (30) 



where $\ddot{X}$ has the size-biased distribution as in (10), and $X^\downarrow$ is the ranked vector with distribution $D^\downarrow_{|\alpha|,d}$ of Section 2.2.2. The full formula of $DM^\downarrow_{|\alpha|,d}$ is obtained by summing over all equivalent choices of indices $\{i_1 < \dots < i_k\}$ in (30).

As $d \to \infty$, assuming $|\alpha| \to |\theta| > 0$, the limit sampling distribution is

$$ESF_{|\theta|}(r;|n|) = \binom{|n|}{r}\frac{1}{\prod_{i=1}^{|n|}b_i!}\,m_{|\theta|}(r;|n|), \tag{31}$$

where

$$m_{|\theta|}(r;|n|) := \frac{|\theta|^k}{|\theta|_{(|n|)}}\prod_{j=1}^{k}(r_j-1)! \tag{32}$$

for $r_j \ge 1$, $j = 1,\dots,k$, with $\sum_{j=1}^{k}r_j = |n|$. The measure $ESF_{|\theta|}(\cdot\,;|n|)$ is known as the Ewens' sampling formula for the distribution of the allele frequency spectrum resulting in a sample of $|n|$ genes, taken from a neutral Wright-Fisher population (i.e. with generator $\mathcal{L}$ given by (3)) at equilibrium.
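The Ewens' sampling formula can be evaluated exactly and checked to be a probability distribution over the integer partitions of the sample size. The sketch assumes the standard form of the formula, a multinomial coefficient divided by $\prod_i b_i!$ and multiplied by $|\theta|^k\prod_j(r_j-1)!/|\theta|_{(|n|)}$; the partition generator and the value of the parameter are illustrative choices.

```python
from fractions import Fraction
from math import factorial


def rising(a, k):
    """Ascending factorial a_(k)."""
    out = Fraction(1)
    for i in range(k):
        out *= a + i
    return out


def partitions(n, largest=None):
    """Yield the integer partitions of n as non-increasing tuples."""
    if largest is None:
        largest = n
    if n == 0:
        yield ()
        return
    for first in range(min(n, largest), 0, -1):
        for rest in partitions(n - first, first):
            yield (first,) + rest


def ewens_probability(parts, theta):
    """Probability of the unlabeled cluster sizes `parts` under the sampling formula."""
    n, k = sum(parts), len(parts)
    value = Fraction(factorial(n))
    for r in parts:
        value /= factorial(r)                    # multinomial coefficient
    for size in set(parts):
        value /= factorial(parts.count(size))    # division by prod b_i!
    value *= theta ** k / rising(theta, n)       # theta^k / theta_(n)
    for r in parts:
        value *= factorial(r - 1)                # prod (r_j - 1)!
    return value


theta = Fraction(3, 2)  # hypothetical parameter value
total_mass = sum(ewens_probability(parts, theta) for parts in partitions(6))
```

For a sample of size two the formula gives probability $1/(1+|\theta|)$ to a single cluster and $|\theta|/(1+|\theta|)$ to two singletons.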

The measure $m_{|\theta|}(r;|n|)$ still embodies all the parameters of the model, but the representation (28) no longer makes sense with $d = \infty$ as there is no positive limit for $X$; however (29) and (30) still hold with $\ddot{X}$ and $X^\downarrow$ having respectively GEM($|\theta|$) and PD($|\theta|$) distributions. Another representation is

$$m_{|\theta|}(r;|n|) = \int x^r\,F^{[r]}_{|\theta|}(dx), \tag{33}$$

with $F^{[r]}_{|\theta|}$ given by (17).

In combinatorics, $ESF_{|\theta|}$ describes the distribution of the cycle lengths in a random permutation with $|\theta|$ as a bias parameter (for other combinatorial interpretations, see [3], [19]). In Bayesian statistics, this is the distribution of the sizes of the unlabeled clusters arising from an iid sample taken from a Poisson-Dirichlet random measure. Still in the context of Bayesian nonparametrics, let $X_1,\dots,X_n$ be iid $F$, where $F$ is a Ferguson-Dirichlet($\alpha$) random distribution, for a diffuse, finite measure $\alpha$ on some Borel space $(E,\mathcal{E})$ with $\alpha(E) = |\theta|$. By Fubini's theorem and (29)-(30), the probability of observing $k$ distinct values $z_1,\dots,z_k$ respectively $r_1,\dots,r_k$ times is given by

$$ESF_{|\theta|}(r;|n|)\prod_{i=1}^{k}\nu(dz_i),$$

where $\nu = \alpha/|\theta|$.
2.4 Conjugacy properties

The Gamma and the Dirichlet distribution, and similarly the Negative Binomial and the Dirichlet-Multinomial distributions, are entangled by yet another property known in Bayesian Statistics as conjugacy with respect to sampling.

A statistical model can be described by a probability triplet $\{M,\mathcal{M},l_\lambda\}_{\lambda \in E}$ where the likelihood function $l_\lambda(x)$ depends on a random parameter $\Lambda$ living in some probability space $(E,\mathcal{E},\pi)$. The distribution $\pi$ of $\Lambda$ is called prior measure of the model. The posterior measure of the model is any version $\pi_x(\cdot) = \pi(\cdot \mid X = x)$ of the conditional probability satisfying

$$\int_A \pi(B \mid X = x)\int_E l_\lambda(dx)\,\pi(d\lambda) = \int_B l_\lambda(A)\,\pi(d\lambda) \quad \text{a.s.} \quad \forall A \in \mathcal{M},\ B \in \mathcal{E}. \tag{34}$$

Definition 4. Let $\mathcal{C}$ be a family of prior measures for a statistical model with likelihood $l_\lambda$. $\mathcal{C}$ is conjugate with respect to $l_\lambda$ if

$$\pi \in \mathcal{C} \implies \pi_x \in \mathcal{C} \quad \forall x.$$

It is easy to check that both Gamma and Dirichlet measures are conjugate classes of prior measures. Bayes' theorem shows us the role as marginal distributions played, respectively, by $NB_{\alpha,p}$ and $DM_\alpha$.

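Example 1. The class of Gamma priors is conjugate with respect to $l_\lambda = Po_\lambda$ on $\{0,1,2,\dots\}$. The posterior measure is

$$\pi_x(d\lambda) = \frac{Po_\lambda(x)\,\gamma_{\alpha,\beta}(d\lambda)}{NB_{\alpha,\frac{\beta}{1+\beta}}(x)} = \gamma_{\alpha+x,\frac{\beta}{1+\beta}}(d\lambda). \tag{35}$$

Similarly, the class of multivariate Gamma priors $\{\gamma_{\alpha,|\beta|} : \alpha \in \mathbb{R}^d_+, |\beta| > 0\}$ is conjugate with respect to $\{Po^d_\lambda(x), \lambda \in \mathbb{R}^d_+, x \in \mathbb{N}^d\}$.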



Example 2. The class of Beta priors $\{D_{|\alpha|,|\beta|} : (|\alpha|,|\beta|) \in \mathbb{R}^2_+\}$ is conjugate with respect to the Binomial likelihood $l_\lambda = B_\lambda(\cdot)$ on $\{0,1,2,\dots,|n|\}$, for any integer $|n|$. The posterior distribution is

$$\pi_{|r|}(d\lambda) = \frac{B_\lambda(|r|,|n|-|r|)\,D_{|\alpha|,|\beta|}(d\lambda)}{DM_{|\alpha|,|\beta|}(|r|;|n|-|r|)} = D_{|\alpha|+|r|,\,|\beta|+|n|-|r|}(d\lambda). \tag{36}$$

Similarly the class of Dirichlet measures is conjugate with respect to multinomial sampling, and so are the Poisson-Dirichlet and the Ferguson-Dirichlet with respect to (possibly unordered) multinomial sampling.



3 Jacobi polynomials on the simplex. 

If $X,Y$ are independent random variables, their distribution $W_{X,Y}$ is the product $W_XW_Y$ of their marginal distributions, and therefore orthogonal polynomials $Q_{n,k}(x,y)$ in $W_{X,Y}$ are simply obtained by products $P_n(x)R_k(y)$ of orthogonal polynomials with $W_X$ and $W_Y$ as weight measures, respectively.
The key idea for deriving multivariate polynomials with respect to Dirichlet measures on the simplex, and to all related distributions treated in the subsequent sections, exploits the several properties of conditional independence enjoyed by the increments of $D_\alpha$, as pointed out in Section 2.1. A method for constructing orthogonal polynomials in the presence of a particular kind of conditional independence, where $Y$ depends on $X$ only through a polynomial $\rho(x)$ of first order, is illustrated by the following modification of Koornwinder's method (see [16], 3.7.2).

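Proposition 2. For $l,d \in \mathbb{N}$, let $(X,Y)$ be a random point of $\mathbb{R}^l \times \mathbb{R}^d$ with distribution $W$. Let $\rho : \mathbb{R}^l \to \mathbb{R}$ define polynomials on $\mathbb{R}^l$ of order at most 1.
Assume that the random variable

$$Z := \frac{Y}{\rho(X)}$$

is independent of $X$. Denote with $W_X$ and $W_Z$ the marginal distributions of $X$ and $Z$, respectively. Then a system of multivariate polynomials, orthogonal with respect to $W$, is given by

$$G_n(x,y) = P^{(N_l)}_{(n_1,\dots,n_l)}(x)\,(\rho(x))^{N_l}\,R_{(n_{l+1},\dots,n_{l+d})}\Big(\frac{y}{\rho(x)}\Big), \qquad (x,y) \in \mathbb{R}^l \times \mathbb{R}^d,\ n \in \mathbb{N}^{l+d}, \tag{37}$$

where $N_l = n_{l+1} + \dots + n_{l+d}$, and $\{P^{(N_l)}_k\}_{k \in \mathbb{N}^l}$ and $\{R_m\}_{m \in \mathbb{N}^d}$ are systems of orthogonal polynomials with weight measures given by $(\rho(x))^{2N_l}W_X$ and $W_Z$, respectively.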

Proof. When $d = l = 1$ this proposition is essentially a probabilistic reformulation of Koornwinder's construction ([16], 3.7.2). The proof is similar for any $l,d$. That $G_n$ is a polynomial of degree $|n|$ is evident, as the denominator of the term of maximum degree in $R$ simplifies with $(\rho(x))^{n_{l+1}+\dots+n_{l+d}}$. To show orthogonality, note that the assumption of conditional independence implies that

$$W(dx,dy) = W_X(dx)\,W_Z\Big(d\frac{y}{\rho(x)}\Big).$$

Denote $b_n = E[P_n^2]$ and $c_n = E[R_n^2]$, $n = 0,1,2,\dots$. For $k,r \in \mathbb{N}^l$ and $m,s \in \mathbb{N}^d$,

$$\int G_{(k,m)}(x,y)\,G_{(r,s)}(x,y)\,W(dx,dy) = \int P^{(|m|)}_k(x)P^{(|s|)}_r(x)(\rho(x))^{|m|+|s|}\,W_X(dx)\int R_m(z)R_s(z)\,W_Z(dz)$$

$$= \int P^{(|m|)}_k(x)P^{(|m|)}_r(x)(\rho(x))^{2|m|}\,W_X(dx)\;c_m\,\delta_{ms} = c_m\,b_k\,\delta_{kr}\,\delta_{ms}.$$

□

3.1 d = 2. Jacobi Polynomials on [0, 1]. 

For $d = 2$, $D_\alpha$ reduces to the Beta distribution, the weight measure of (shifted) Jacobi polynomials. These are functions of one variable living in $\Delta_1 = [0,1]$. It is convenient to recall some known properties of such polynomials. Consider the measure

$$w_{a,b}(dx) = (1-x)^a(1+x)^b\,\mathbb{I}(x \in (-1,1))\,dx, \qquad a,b > -1, \tag{38}$$

where $\mathbb{I}(A)$ is the indicator function, equal to 1 if $A$ holds, and 0 otherwise. This is the weight measure of the Jacobi polynomials defined by

$$P_n^{(a,b)}(x) := \frac{(a+1)_{(n)}}{n!}\;{}_2F_1\Big(-n,\,n+a+b+1;\,a+1;\,\frac{1-x}{2}\Big),$$

where ${}_pF_q$, $p,q \in \mathbb{N}$, denote the Hypergeometric function (see [1] for basic properties).
The normalization constants are given by the relation

$$\int P_n^{(a,b)}(x)P_m^{(a,b)}(x)\,w_{a,b}(dx) = \frac{2^{a+b+1}}{2n+a+b+1}\,\frac{\Gamma(n+a+1)\Gamma(n+b+1)}{n!\,\Gamma(n+a+b+1)}\,\delta_{mn}.$$

The Jacobi polynomials are known to be solutions of the second order differential equation

$$(1-x^2)y''(x) + [b-a-x(a+b+2)]\,y'(x) = -n(n+a+b+1)\,y(x). \tag{39}$$
By a simple shift of measure it is easy to see that, for a, (3 > and 9 :^ a + f3 , the modified polynomials 

^"■^W- , ^ P^-'-'^-\2x-l) a,(3>0 (40) 
are orthogonal with respect to the Beta distribution on [0, 1] which can be written as

where u = 2x — 1. Note that impUes that, for every n, P^'^{x) solves 

i^2y{x) — —n{n + — 1)2/(2;) 

where is given by 

For the shifted system {P"'^, D^.p) the constants are 



Vn{a,P) Jo 



9(2„)(^ + n-l)(„)' 



n = 0,l,. 



To prove it one just uses (|^ . dH]) and the property of Beta functions: 



-'(n+m) 



The value at 1 of Jacobi polynomials is 



which implies 



P„"''^(l) = 



(a + 1) 



+ n- 1) 



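Denote the standardized Jacobi polynomials with

$$R_n^{(a,b)}(x) := \frac{P_n^{(a,b)}(x)}{P_n^{(a,b)}(1)}$$

and

$$R_n^{\alpha,\beta}(x) := \frac{P_n^{\alpha,\beta}(x)}{P_n^{\alpha,\beta}(1)}.$$

Obviously

$$R_n^{\alpha,\beta}(x) = R_n^{(\beta-1,\alpha-1)}(2x-1).$$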
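The standardized polynomials on [0, 1] lend themselves to an exact check. Combining the hypergeometric definition of the Jacobi polynomials with the shift $x \mapsto 2x-1$ gives $R_n^{\alpha,\beta}(x) = {}_2F_1(-n,\,n+\theta-1;\,\beta;\,1-x)$ with $\theta = \alpha+\beta$; this closed form is derived here as a working assumption rather than quoted from the text. The sketch stores each polynomial by its coefficients in powers of $t = 1-x$ and integrates against the Beta law through the moments $E[t^k] = \beta_{(k)}/\theta_{(k)}$.

```python
from fractions import Fraction
from math import factorial


def rising(a, k):
    """Ascending factorial a_(k)."""
    out = Fraction(1)
    for i in range(k):
        out *= a + i
    return out


def jacobi_coeffs(n, alpha, beta):
    """Coefficients c_k of R_n(x) = sum_k c_k (1 - x)^k, read off the terminating
    series 2F1(-n, n + theta - 1; beta; 1 - x), theta = alpha + beta."""
    theta = alpha + beta
    return [
        rising(Fraction(-n), k) * rising(n + theta - 1, k) / (rising(beta, k) * factorial(k))
        for k in range(n + 1)
    ]


def beta_inner(p, q, alpha, beta):
    """E[p(X) q(X)] for X ~ Beta(alpha, beta), with p and q given in powers of 1 - X."""
    theta = alpha + beta
    return sum(
        a * b * rising(beta, i + j) / rising(theta, i + j)
        for i, a in enumerate(p)
        for j, b in enumerate(q)
    )


alpha, beta = Fraction(3, 2), Fraction(2)  # hypothetical parameters
polys = [jacobi_coeffs(n, alpha, beta) for n in range(5)]
gram = [[beta_inner(p, q, alpha, beta) for q in polys] for p in polys]
```

The constant coefficient of each list is the value at $x = 1$, which equals 1 by standardization, and the Gram matrix `gram` is diagonal.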
Then the new constant of proportionality is

$$\int_0^1\big[R_n^{\alpha,\beta}(x)\big]^2\,D_{\alpha,\beta}(dx) = \frac{n!}{(\theta+2n-1)\,\theta_{(n-1)}}\,\frac{\alpha_{(n)}}{\beta_{(n)}},$$

where again we used (43). A symmetry relation is therefore

$$R_n^{\alpha,\beta}(x) = \frac{R_n^{\beta,\alpha}(1-x)}{R_n^{\beta,\alpha}(0)}.$$



Note that, if {P*"'''(x)} is a system of orthonormal polynomials with weight measure Da 
3.2 2 < d < ∞. Multivariate Jacobi polynomials on the simplex.

3.3 Multivariate Jacobi from right-neutrality.

A system of multivariate polynomials with respect to a Dirichlet distribution on $d < \infty$ points can be derived by using its right-neutrality property, via Proposition 2. Let $\mathbb{N}_{d,|m|} = \{n = (n_1,\dots,n_d) \in \mathbb{N}^d : |n| = |m|\}$.

For every $n \in \mathbb{N}_{d-1,|n|}$ and $\alpha \in \mathbb{R}^d_+$ denote $N_j = \sum_{i=j+1}^{d-1}n_i$ and $A_j = \sum_{i=j+1}^{d}\alpha_i$.

Proposition 3. For $d < \infty$, a system of multivariate orthogonal polynomials on the Dirichlet distribution $D_\alpha$ is given by

$$R_n^\alpha(x) = \prod_{j=1}^{d-1}R_{n_j}^{\alpha_j,\,A_j+2N_j}\Big(\frac{x_j}{1-s_{j-1}}\Big)(1-s_{j-1})^{n_j}, \qquad x \in \Delta_{(d-1)}, \tag{49}$$

where $s_j = \sum_{i=1}^{j}x_i$.

Notice that similar systems of orthogonal polynomials could be obtained by replacing, in (49), $R_{n_j}$ with either $P^*_{n_j}$ (orthonormal Jacobi polynomials) or $P_{n_j}$ as defined in (40). The choice of $R_{n_j}$ is useful because of the standardization property

$$R_n^\alpha(e_d) = 1, \tag{50}$$

where $e_j := (\delta_{ij} : i = 1,\dots,d)$.

A similar definition for polynomials in the Dirichlet distribution is proposed by [17], in terms of non-shifted Jacobi polynomials. For an alternative choice of basis, see e.g. [6].

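Proof. The polynomials $R_n^\alpha(x)$ given in Proposition 3 admit a recursive definition as follows:

$$R^\alpha_{n_1,\dots,n_{d-1}}(x_1,\dots,x_{d-1}) = R_{n_1}^{\alpha_1,\,A_1+2N_1}(x_1)\,(1-x_1)^{N_1}\,R^{\alpha^*}_{n_2,\dots,n_{d-1}}\Big(\frac{x_2}{1-x_1},\dots,\frac{x_{d-1}}{1-x_1}\Big), \tag{51}$$

where $\alpha^* = (\alpha_2,\dots,\alpha_d)$; so Proposition 2 is used with $l = 1$, $\rho(x) = 1-x$ and inductively on $d$. The claim is a consequence of the neutral-to-the-right property and Proposition 2. To see this, unroll the recursion and consider the orthogonality of a term

$$R_{n_j}^{\alpha_j,\,A_j+2N_j}\Big(\frac{x_j}{1-s_{j-1}}\Big)\Big(\frac{1-s_j}{1-s_{j-1}}\Big)^{N_j} \tag{52}$$

in $R_n^\alpha$ with a similar term in $R_m^\alpha$ for some $m = (m_1,\dots,m_{d-1})$. Assume without loss of generality that for some $j = 1,\dots,d-1$, $m_k = n_k$ for $k = j+1,\dots,d-1$ and $m_j < n_j$. Then $N_j = M_j$, and multiplying the product of the two terms of the form (52) by the corresponding Beta density $D_{\alpha_j,A_j}(dB_j)/dB_j$, where $B_j$ is as in (8), gives

$$R_{n_j}^{\alpha_j,A_j+2N_j}(B_j)\,R_{m_j}^{\alpha_j,A_j+2N_j}(B_j)\,(1-B_j)^{2N_j}\,\frac{D_{\alpha_j,A_j}(dB_j)}{dB_j} \propto R_{n_j}^{\alpha_j,A_j+2N_j}(B_j)\,R_{m_j}^{\alpha_j,A_j+2N_j}(B_j)\,\frac{D_{\alpha_j,A_j+2N_j}(dB_j)}{dB_j}. \tag{53}$$

Since $R_{n_j}^{\alpha_j,A_j+2N_j}$ is orthogonal to polynomials of degree less than $n_j$ on the weight measure $D_{\alpha_j,A_j+2N_j}$, the integral with respect to $dB_j$ of the quantity (53) vanishes, which proves the orthogonality. □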
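The argument of the proof can be replayed in exact arithmetic. In the increments $B_j$ the multivariate polynomial is a product of univariate factors $R_{n_j}^{\alpha_j,A_j+2N_j}(B_j)(1-B_j)^{N_j}$, and under $D_\alpha$ the $B_j$ are independent Beta$(\alpha_j,A_j)$ variables, so inner products factorize. The sketch below assumes the closed form $R_n^{a,b}(x) = {}_2F_1(-n,\,n+a+b-1;\,b;\,1-x)$ for the standardized polynomials on [0, 1] and a hypothetical parameter vector with $d = 3$.

```python
from fractions import Fraction
from itertools import product
from math import factorial


def rising(a, k):
    """Ascending factorial a_(k)."""
    out = Fraction(1)
    for i in range(k):
        out *= a + i
    return out


def jacobi_coeffs(n, a, b):
    """Coefficients, in powers of t = 1 - x, of 2F1(-n, n + a + b - 1; b; t)."""
    return [
        rising(Fraction(-n), k) * rising(n + a + b - 1, k) / (rising(b, k) * factorial(k))
        for k in range(n + 1)
    ]


def beta_inner(p, q, a, b):
    """E[p q] under Beta(a, b), polynomials in t = 1 - B, E[t^k] = b_(k) / (a + b)_(k)."""
    return sum(
        u * v * rising(b, i + j) / rising(a + b, i + j)
        for i, u in enumerate(p)
        for j, v in enumerate(q)
    )


def factors(n, alpha):
    """Univariate factors of R_n in the increments B_1, ..., B_{d-1}."""
    out = []
    for j in range(len(alpha) - 1):
        tail_a = sum(alpha[j + 1:])   # A_j
        tail_n = sum(n[j + 1:])       # N_j
        coeffs = jacobi_coeffs(n[j], alpha[j], tail_a + 2 * tail_n)
        out.append([Fraction(0)] * tail_n + coeffs)  # multiplication by t^{N_j}
    return out


def dirichlet_inner(n, m, alpha):
    """E[R_n(X) R_m(X)] under the Dirichlet law, using independence of the B_j."""
    value = Fraction(1)
    for j, (p, q) in enumerate(zip(factors(n, alpha), factors(m, alpha))):
        value *= beta_inner(p, q, alpha[j], sum(alpha[j + 1:]))
    return value


alpha = (Fraction(1, 2), Fraction(1), Fraction(3, 2))  # hypothetical, d = 3
indices = list(product(range(3), repeat=2))
off_diagonal = [dirichlet_inner(n, m, alpha) for n in indices for m in indices if n != m]
diagonal = [dirichlet_inner(n, n, alpha) for n in indices]
```

All off-diagonal inner products vanish exactly, while the diagonal ones are strictly positive.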

The normalization constant for {R"} can be easily derived as 



1 



{R^{x)f Da{dx) 



= TT ^"^'^ ■ (54) 

Notice that the same construction shown in Proposition [31 could be similarly expressed in terms of the 
polynomials {Pn-'^''^^^'} or |p*"j^-4j+2Afj | jj^g^g^^j }, the only difference resulting in the 

orthogonality coefficients. 
3.4 Limit polynomials on the GEM distribution.

Remember that the size-biased permutation of a Dirichlet distribution is still a right-neutral distribution, so that orthogonal polynomials can be constructed in very much the same way as in Proposition 3, with a similar proof.

Proposition 4. A system of orthogonal polynomials in $\ddot{D}_{|\theta|,d}$ is

$$\ddot{R}_n^{(|\theta|,d)}(x) = \prod_{j=1}^{d-1}R_{n_j}^{|\theta|/d+1,\,\frac{d-j}{d}|\theta|+2N_j}\Big(\frac{x_j}{1-s_{j-1}}\Big)(1-s_{j-1})^{n_j}, \qquad x \in \Delta_{(d-1)},\ n \in \mathbb{N}^{d-1}. \tag{55}$$

As $d \to \infty$, $\ddot{D}_{|\theta|,d}$ converges to the so-called GEM distribution, i.e. an infinite-dimensional right-neutral distribution with all iid weights being Beta random variables with parameter $(1,|\theta|)$. Let $\ddot{D}_{|\theta|,\infty} = \lim_{d \to \infty}\ddot{D}_{|\theta|,d}$ denote the GEM distribution with parameter $|\theta|$. An immediate consequence is

Corollary 1. For $|\theta| > 0$, an orthogonal system with respect to the weight measure $\ddot{D}_{|\theta|,\infty}$ is given by the polynomials:

$$\ddot{R}_n^{|\theta|}(x) = \prod_{j=1}^{\infty}R_{n_j}^{1,\,|\theta|+2N_j}\Big(\frac{x_j}{1-s_{j-1}}\Big)(1-s_{j-1})^{n_j}, \qquad x \in \Delta_\infty,\ n \in \mathbb{N}^\infty : |n| = 0,1,\dots \tag{56}$$



4 Multivariate Jacobi and Multiple Laguerre polynomials. 

The Laguerre polynomials, defined by

$$L_{|n|}^{|\alpha|}(y) := {}_1F_1(-|n|;\,|\alpha|;\,y), \qquad |\alpha| > 0, \tag{57}$$

are orthogonal with respect to the Gamma density $\gamma_{|\alpha|,1}$ with constant of proportionality

$$\int_0^\infty\big[L_{|n|}^{|\alpha|}(y)\big]^2\,\gamma_{|\alpha|,1}(dy) = \frac{|n|!}{|\alpha|_{(|n|)}}. \tag{58}$$

(Note that the usual convention is to define Laguerre polynomials in terms of the parameter $|\alpha'| := |\alpha| - 1 > -1$. Here we prefer to use a positive parameter for consistency with the parameters in the Gamma distribution.)
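Orthogonality of the Laguerre polynomials under the Gamma law can be verified exactly from the Gamma moments $E[Y^k] = |\alpha|_{(k)}$. The sketch assumes the normalization $L_n^{\alpha}(y) = {}_1F_1(-n;\alpha;y)$, with no further prefactor, under which a direct computation gives squared norms equal to $n!/\alpha_{(n)}$; the parameter value is a hypothetical example.

```python
from fractions import Fraction
from math import factorial


def rising(a, k):
    """Ascending factorial a_(k)."""
    out = Fraction(1)
    for i in range(k):
        out *= a + i
    return out


def laguerre_coeffs(n, alpha):
    """Coefficients of 1F1(-n; alpha; y) in powers of y (assumed normalization)."""
    return [
        rising(Fraction(-n), k) / (rising(alpha, k) * factorial(k))
        for k in range(n + 1)
    ]


def gamma_inner(p, q, alpha):
    """E[p(Y) q(Y)] for Y ~ Gamma(alpha, 1), using E[Y^k] = alpha_(k)."""
    return sum(
        a * b * rising(alpha, i + j)
        for i, a in enumerate(p)
        for j, b in enumerate(q)
    )


alpha = Fraction(5, 2)  # hypothetical shape parameter
polys = [laguerre_coeffs(n, alpha) for n in range(6)]
gram = [[gamma_inner(p, q, alpha) for q in polys] for p in polys]
```

The first two polynomials are $1$ and $1 - y/|\alpha|$, and the second has variance $1/|\alpha|$ under the Gamma law.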

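Remark 2. If $Y$ is a Gamma($|\alpha|$) random variable, then, for every scale parameter $|\beta| \in \mathbb{R}_+$, the distribution of $Z := |\beta|Y$ is $\gamma_{|\alpha|,|\beta|}(dz)$. Thus the system

$$L_{|n|}^{|\alpha|,|\beta|}(z) := L_{|n|}^{|\alpha|}\Big(\frac{z}{|\beta|}\Big)$$

is orthogonal with weight measure $\gamma_{|\alpha|,|\beta|}$.

Let $Y \in \mathbb{R}^d_+$ be a random vector with distribution $\gamma_{\alpha,|\beta|}$. By the stochastic independence of its coordinates, orthogonal polynomials of degree $|n|$ with the distribution of $Y$ as weight measure are simply

$$L_n^{\alpha,|\beta|}(y) = \prod_{i=1}^{d}L_{n_i}^{\alpha_i,|\beta|}(y_i), \qquad y \in \mathbb{R}^d,\ n \in \mathbb{N}^d, \tag{59}$$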
with constants of proportionality of



-=E(L°(r)f =n 



Therefore, with the notation introduced in Section 2.1.1, because of the one-to-one mapping

$$(Y_1, \ldots, Y_d) \mapsto (|Y|, X_1, \ldots, X_d),$$

one can obtain an alternative system of orthogonal polynomials on $y_1, \ldots, y_d$:

Proposition 5. The polynomials defined by

\n'\ 



(60) 



(61) 



with n' — (fii, . . . ,nd-i) and R"^ defined by Ili49\ ), are orthogonal with respect to 7^ 

Proof. The proof of ((6T|) is straightforward and follows immediately from Proposition^ with I — 1, X ^ \Y\ 
and $p(x) = x$ (remember that $|Y|$ is Gamma with parameters $(|\alpha|, |\beta|)$).

□ 

From now on we will only consider the case with $|\beta| = 1$, without much loss of generality. The constant of
proportionality of the resulting system $\{L^*_n\}$ is



1 



i=l 



(2|n'|) 



1 



^1 a I -\-2\n' 



{\y\) 7a+2|n'|('^|y|) 



(2|n'| 



(62) 



where is as in ([54|) . 

The two systems $\{L_n\}$ and $\{L^*_n\}$ can be expressed as linear combinations of each other:

LTiy)^ E ^mC(n)L^(j/) 

|mj = |ra| 

and 

Lniy) = E V'mC™(ri)i^*(2/), 



(63) 
(64) 



I T7i I — I n 



where 



c*min)S\m\\n\ =E[L5J*(y)L° (y)] = c„(to)(5|„||„| . 



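For general $m, n$ a representation for $c^*_m(n)$ can be derived in terms of a mixture of Lauricella functions of
the first (A) type. Such functions are defined ([H]) as

$$F_A(|a|; b; c; z) = \sum_{m \in \mathbb{N}^d} \frac{1}{m!}\,\frac{|a|_{(|m|)}\, b_{(m)}}{c_{(m)}}\, z^m, \qquad a, b, c, z \in \mathbb{R}^d,$$

where $v_{(r)} := \prod_{i=1}^d (v_i)_{(r_i)}$ for every $v, r \in \mathbb{R}^d$.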
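When every entry of $b$ is a non-positive integer, as in the arguments $-m$, $-j$ used in this section, the Lauricella series terminates and can be summed directly. The following sketch evaluates it under that assumption; the function name `lauricella_fa` is hypothetical and the code is not meant for general (non-terminating) arguments.

```python
from itertools import product
from math import factorial

def rising(a, k):
    """Rising factorial a_(k) = a (a+1) ... (a+k-1)."""
    out = 1.0
    for i in range(k):
        out *= a + i
    return out

def lauricella_fa(a, b, c, z):
    """F_A(a; b; c; z) as a finite sum.

    Assumption: every b_i is a non-positive integer, so that (b_i)_(m_i)
    vanishes for m_i > -b_i and the multi-index sum is finite.
    """
    ranges = [range(int(-bi) + 1) for bi in b]
    total = 0.0
    for m in product(*ranges):
        term = rising(a, sum(m))  # a_(|m|)
        for bi, ci, zi, mi in zip(b, c, z, m):
            term *= rising(bi, mi) * zi ** mi / (rising(ci, mi) * factorial(mi))
        total += term
    return total

# One-dimensional case: F_A reduces to 2F1(a, -1; c; z) = 1 - a z / c.
print(lauricella_fa(2.0, [-1], [3.0], [0.5]), 1 - 2.0 * 0.5 / 3.0)
```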
Proposition 6. For every $n$ denote $n' := (n_1, \ldots, n_{d-1})$.

\n\ 



where 



\n'\ 
i=0 



DMa,im)J2dj [ R^,{t)FAi\a\;-m,-j;a,\a\;t,l - \t\,l)Do,idt) 



j=0 -"^(d-l) 



|a|(l„,|)(|a| + 2|n'|)(„^) 



FAi\a\; -i, -Ud, -j; \a\, \a\ + 2i, \a\; 1, 1, 1). 



(65) 



(66) 



Remark 3. An equivalent representation of c'!^{n) in terms of Hahn polynomials will be given in section 

Proof. The building block of the proof is the following beautiful representation due to Erdélyi ([7]): for every
$|a|, |z| \in \mathbb{R}$, $a, k \in \mathbb{R}^d$ and $n \in \mathbb{N}^d$,



where 



0s(|a|; a; n\ k) = FA(|a|; -n, -s; a, \a\;k, 



(67) 



Now 



C(n) = E[LT{Y)L:,,{Y)] 



E 



= Ea 



4"J+""''(l>'l)|i^l'"''K'mn^".(^:'l^l) 



(68) 



where $T \in \Delta_{(d-1)}$ has distribution $D_\alpha$. The inner expectation is with respect to the Gamma($|\alpha|$) measure,
and the outer expectation is with respect to the Dirichlet distribution.
We start by evaluating the inner expectation. Since



\y 



\n\' 



\n'\ 

E 

i=0 



-|n'|(,)r(|a| + |n'|) |„l 



n\a\+2) 



(see 0, P- 156), then 



4rJ^^|"'|(iyi)iy|i"'i = E ''"^rfii''^,V''^'^ ^^"'(i^i)^»°'^'"''(i^i) 



i=0 
\n'\ 

E 



T{\a\+2) 
-K|(,)r(|aH-K|) 



..0 r(|a|+^) 

Y:d,L\"\\y\). 



E-'..4"'(i2/i) 



(69) 
The second equality in ([Mjl is obtained by applying ([57)1 to (|y|)Ll^^''''^'" '(|y|). There

Ci,j = \a\, \a\ + 2\n'\;i,nd; 1, 1) 

vanishes for j > i + n^, by orthogonality of and (|67p . The last equality is obtained just by inverting 

the order of summation (note that — = for i > \n'\). 

Now apply again (p7|) to n^=i ^"j i^o^e that, for every a; G M'' 

d 

Hi 



hence by ([M)) 



|a|;a;n;a;)=E|Ll"l(|r|)ni"j(^.|y|)| , 



I LlrJ+""''(|y|)l>^l'"'' n I = ^ 

^d,E Li"i(|y|)ni",^(t.m) = 

J=0 \ j=l J 

\n'\ 

^d,(/.,(|a|;a;n;t,l-|t|). (70) 



3=0 

Therefore, taking the expectation over T yields 

c*min) = E„(K,(T)0,(|a|;a;n;T)) 



H(|n 



^DM^{m)Y,dj f K,{t)FA{\a\;^m,-j,a,\a\;t,l-\t\,l)D^{dt) (71) 



ini! 

j=0 •'^(d-l) 



which is what we wanted to prove. □ 



Remark 4. Note that when $|n'| = 0$, $c^*_m(0, \ldots, 0, n_d) = 1$, which agrees with the known identity

$$L^{\alpha+\beta}_n(x+y) = \sum_{j=0}^{n} L^{\alpha}_j(x)\, L^{\beta}_{n-j}(y), \qquad x, y \in \mathbb{R} \tag{72}$$

(see JBl J- (6.2.35), p. 191), an identity with an obvious extension to the d- dimensional case. 
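The addition identity (72) is easy to confirm numerically. The sketch below assumes the positive-parameter convention of (57), $L^{a}_{n}(y) = \frac{a_{(n)}}{n!}\,{}_1F_1(-n; a; y)$; the helper names are hypothetical.

```python
from math import factorial

def rising(a, k):
    """Rising factorial a_(k) = a (a+1) ... (a+k-1)."""
    out = 1.0
    for i in range(k):
        out *= a + i
    return out

def laguerre(n, alpha, y):
    """L_n^alpha(y) = alpha_(n)/n! * 1F1(-n; alpha; y), with alpha > 0."""
    lead = rising(alpha, n) / factorial(n)
    return lead * sum(rising(-n, k) * y ** k / (rising(alpha, k) * factorial(k))
                      for k in range(n + 1))

def addition_rhs(n, a, b, x, y):
    """Right-hand side of (72): sum_j L_j^a(x) L_{n-j}^b(y)."""
    return sum(laguerre(j, a, x) * laguerre(n - j, b, y) for j in range(n + 1))

a, b, x, y = 1.5, 2.0, 0.7, 1.3
for n in range(5):
    # The two printed columns agree up to rounding.
    print(n, laguerre(n, a + b, x + y), addition_rhs(n, a, b, x, y))
```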
Remark 5. It is immediate to verify that the coefficients $c^*_m(n)$ also satisfy

ijl„,|(l/3-^2/|)|rV|l"'li?"' (^) = E ^n.cUn)L^{\p-'y\), l3eR+. (73) 

^'^'^ |m| = |n| 
4.1 Infinite-dimensional multiple Laguerre.

From Remark 1 it is possible to derive an infinite-dimensional version of $\{L^*_n\}$, orthogonal with respect to the
law of the size-biased point process $\ddot{Y}^\infty$, obtained from $Y^\infty$ of Section 2.2. Remember that $\ddot{X}^\infty := \ddot{Y}^\infty/|\ddot{Y}^\infty|$
has GEM($|\theta|$) distribution and is independent of $|\ddot{Y}^\infty| = |Y^\infty|$, which has a Gamma($|\theta|$) law. The proof of
the following corollary is, at this point, obvious from Corollary 1 and Proposition 5.

Corollary 2. Let 7|g| be the probability distribution of the size-biased sequence Y°° obtained by rearranging 
in size-biased random order the sequence Y°° of points of a Poisson process with generating functional US}) . 
The polynomials defined by 

^is,„,(i/)=^!:!r'"''(i^i)(i^i)''''^!."(e) (^4) 

for \m\ G N, n' e : \n'\ e N, with {Rn} as in i56\) . form an orthogonal system with respect to 7je|- 



5 Multivariate Hahn Polynomials. 
5.1 Hahn polynomials on {1,. . . ,N}. 

As for the Laguerre polynomials, we introduce the discrete Hahn polynomials on {1, ..., N} with parameters
shifted by 1 to make the notation consistent with the standard probabilistic notation in the corresponding
weight measure. The Hahn polynomials, orthogonal on $DM_{\alpha,\beta}(r; N)$, are defined as the hypergeometric
series:

$$h^{\alpha,\beta}_n(r; N) = {}_3F_2(-n,\ n+\alpha+\beta-1,\ -r;\ \alpha,\ -N;\ 1), \qquad n = 0, 1, \ldots, N.$$

The orthogonality constants are given by



1 ^ 

7jT--=T. "(^5 ^)] ' DM^Ar^; N) = 



1 (^ + ^)(n) 



1 



/3(n) 



'■N.n r=0 



Q 0(„_i) + 2n-la(„)- 



A special point value is ([E], (1.15)) 



Thus if we consider the normalization 



then the new constant is, from (j76p . 

1 



/i^^^(A^;iV) = (-1)"^, 



hZ^P{r-N) 



\N-N) 



1 + 



O &in-i) + 2n-l/3(„) 

{0 + N)in) 



a, 13 ' 



(75) 



(76) 



(77) 



where Cn is the Jacobi orthogonality constant, given by ([461 
A symmetry relation is 



q^^''-{N~r;N) 



(78) 
A well-known relationship is in the limit:

lim hZ'^^{Nz;N) = RZ~^''^-'^{l-2z) a,/3>0 (79) 

(see [13]) where i?"'** — R'^'' / Rf{'' (1) are standardized Jacobi polynomials orthogonal on [—1, 1] as defined 
in section Because of our definition ([iO]) . combining ([i7|) . ((75|) and ([50]) gives the equivalent hmit: For 
every n, 



mil r^-f^fAT^- AT\ — 

Note that also 



lim q^'^iNz; N) = R'^-^{z) a,P>0. (80) 



lim «;:^;^„ = C'''. (81) 

An inverse relation holds as well, which allows one to derive Hahn polynomials as a mixture of Jacobi polynomials.
Denote by $B_x(r; N) = B_{x,1-x}(r, N-r)$ the Binomial distribution.

Proposition 7. The functions 

^^'{r;N) := f RZ'P{x) D^.p{dx) (82) 

= f R^'''ix)D^+r.p+NMdx), n = G,l,...,iV, (83) 

Jo 

form the Hahn system of orthogonal polynomials with $DM_{\alpha,\beta}$ as weight function, such that

rn '^r; N) = ^^^^ q7/{r; N). (84) 



The representation (83), in particular, shows a Bayesian interpretation of Hahn polynomials as a posterior
mixture of Jacobi polynomials evaluated at a random Bernoulli probability of success $X$, conditionally on
having previously observed $r \in \{0, \ldots, N\}$ successes out of $N$ independent Bernoulli($X$) trials, where $X$ has a
Beta($\alpha, \beta$) prior distribution.
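The posterior-mixture interpretation can be tested numerically. The sketch below takes the Jacobi polynomial on $[0,1]$, up to a constant, as ${}_2F_1(-n, n+\alpha+\beta-1; \alpha; x)$ (an assumption about normalisation), averages it over the Beta($\alpha+r$, $\beta+N-r$) posterior using the posterior moments $E[X^k \mid r] = (\alpha+r)_{(k)}/(\alpha+\beta+N)_{(k)}$, and checks orthogonality under the Beta-Binomial weight $DM_{\alpha,\beta}(r;N)$. All function names are hypothetical.

```python
from math import comb

def rising(a, k):
    """Rising factorial a_(k) = a (a+1) ... (a+k-1)."""
    out = 1.0
    for i in range(k):
        out *= a + i
    return out

def jacobi_coeffs(n, alpha, beta):
    """Monomial coefficients of 2F1(-n, n+alpha+beta-1; alpha; x)."""
    theta = alpha + beta
    coeffs, fact = [], 1.0
    for k in range(n + 1):
        if k > 0:
            fact *= k
        coeffs.append(rising(-n, k) * rising(n + theta - 1, k)
                      / (rising(alpha, k) * fact))
    return coeffs

def posterior_mixture(n, r, N, alpha, beta):
    """E[R_n(X) | r successes in N trials], X ~ Beta(alpha, beta) a priori."""
    theta = alpha + beta
    return sum(c * rising(alpha + r, k) / rising(theta + N, k)
               for k, c in enumerate(jacobi_coeffs(n, alpha, beta)))

def beta_binomial_pmf(r, N, alpha, beta):
    """Dirichlet-Multinomial weight DM_{alpha,beta}(r; N) for d = 2."""
    return (comb(N, r) * rising(alpha, r) * rising(beta, N - r)
            / rising(alpha + beta, N))

def inner_product(n, m, N, alpha, beta):
    """Sum over r of DM(r; N) q_n(r) q_m(r) for the mixture polynomials."""
    return sum(beta_binomial_pmf(r, N, alpha, beta)
               * posterior_mixture(n, r, N, alpha, beta)
               * posterior_mixture(m, r, N, alpha, beta)
               for r in range(N + 1))

print(inner_product(1, 3, 6, 1.5, 2.0))  # close to zero
print(inner_product(2, 2, 6, 1.5, 2.0))  # strictly positive
```

For $n = 1$ the mixture equals $\frac{N}{\alpha+\beta+N}\left(1 - \frac{(\alpha+\beta)\,r}{\alpha N}\right)$, which is proportional to the degree-one Hahn polynomial, in line with the proposition.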



Proof. The integral defined by ([82]) is a polynomial: consider 



(" + 0(n)(/3 + ^-^)(„.) 
(^ + ^)(n+™) 

The numerator is a polynomial in $(r, N-r)$ of order $n + m$. Write

11 



then 
where L is a polynomial in r of order less than n. Then is a polynomial of order n in r.

To show orthogonality it is sufficient to show that $q^{\alpha,\beta}_n$ are orthogonal with respect to polynomials of the basis
formed by the falling factorials $\{r_{[l]},\ l = 0, 1, \ldots\}$. For $l \le n$,



r-l 



TVm / x'Rl^P{x)D^^p{dx). 



(86) 



The last integral is nonzero only if $l = n$, which proves the orthogonality of $q^{\alpha,\beta}_n(r; N)$.
Now consider that, in $R^{\alpha,\beta}_n(x)$, the leading coefficient $c_n$ satisfies



c^x^R'^'''{x)D^_0idx) 



a,/3 



Y,DM^Ar;N)qZ'^{r-N)qZ^P{r-N) = ^ i^Af^^^r; TV) 



■'N,n r=0 



= N, 



x''RZ-f\x)D^^p{dx) 



That is, 



N,n 



w 



N,n 



with w^'^ as in (|75|) . and therefore the identity (I84p follows, completing the proof. 



(87) 



□ 



5.2 Multivariate polynomials on the Dirichlet-Multinomial distribution. 

Multivariate polynomials orthogonal with respect to $DM_\alpha$ on the discrete d-dimensional simplex were first
introduced by Karlin and McGregor [12], as eigenfunctions of the Birth-and-Death process with Neutral
mutation. Here we give an alternative derivation as a posterior mixture of multivariate Jacobi polynomials,
which extends Proposition 7 to a multivariate setting.



Proposition 8. For every $\alpha \in \mathbb{R}^d_+$, a system of polynomials orthogonal with respect to $DM_\alpha$ is given by

qnir;\r\) = I R':,{x)^^D^{dx) (89) 



R^{x)Da+ridx), \n\ = \r\ (90) 



'(d-l) 

-d-1 



ffT 1 (A, + R, + Ni+i), , \ jL 



V (|a| + kl)(Ari) / J-^ 

with constant of orthogonality given by 



^ :^E[gg(jt';H)]^= ^. (92) 



uJn{a-M) ' ' ' (|a| + kl)(„) C^' 
Proof. The identity between (89) and (90) is obvious from Section 2.4, and (91) follows from Proposition 7
and some simple algebra. For every $n \in \mathbb{N}^d$,

-1) 



(l"l + kl)(|n|) 

where $L$ is a polynomial in $r$ of order less than $|n|$. Therefore $q^\alpha_n(r; |r|)$ are polynomials of order $|n|$ in $r$.
To show that they are orthogonal, denote

d 

Piir) n('"*)M 
1=1 

and consider that, for every Z G N : |Z| < |rt| = |r|, 



|m| = |r| I'l''--^ \|m| = lr 



m — I 



x"'-' D^{dx) 



= |r|[|,|] j x'R':Xx)D^{dx) (94) 

which, by orthogonality of $R^\alpha_n$, is nonzero only if $|l| = |n|$. Since it is always possible to write, for appropriate
coefficients $c_{nm}$,

" + c, 

\m\ — \n\ 

where C is a polynomial of order less than \n\ in x; then 

(l«l + M)(N) 

and by IMD 

E[g?(i?;|r|)g;j(i?;|r|)] = ^ 7^T^^ E [p^ (i?; |r |)] + C" 



E ^^fh / x'^K{x)D^{dx) 



'lllnll 1 c 



(l"l + kl)(|n|) C 



□ 



Remark 6. Note that the representation (91) holds also for negative parameters, so that, if we replace $\alpha$
with $-\epsilon$ ($\epsilon \in \mathbb{R}^d_+$) then (91) is a representation for polynomials with respect to the Hypergeometric distribution
(Section\MlM- 
5.2.1 Bernstein-Bezier coefficients of Jacobi polynomials.

As anticipated in the Introduction, Proposition 8 gives a probabilistic proof of a recent result of [22], namely
that Hahn polynomials are the Bernstein-Bezier coefficients of the multivariate Jacobi polynomials. Remember
that the Bernstein polynomials, when taken on the simplex, are essentially multinomial distributions
$B_x(n) = \binom{|n|}{n} x^n$ seen as functions of $x$.

Corollary 3. For every 



=^,(|a|;|r|) ^ g?(m; |r|)i?,(m). 

\m\ — \r\ 



where LUr{\a\; \r\) is given by 
Proof. From Proposition [H 



so 



Hence 



DM^im : \r\)q';!(m; \r\) = E [Bx{m)R^{X)] 

\m\ 

B^{m) = DMa{m; \m\) ^ ^"(to; \m\)Rn{x). 

|n|=0 



J2g?{m;\r\)BAr) = ^ 

m \n\=0 

|r| 

= E 



J2 DA'Um;\r\)q'^im;\r\)q^^im;\r\ 

\m\ = \r\ 



RZ{x) 



|n|=0 



uJr{\a\\ \r\) 



SrnR?M = 



uJr{\a\; \r\) 



which completes the proof. 

Remark 7. By a similar argument it is easy to come back from (95) to (89).



(95) 



(96) 
□ 



5.2.2 The connection coefficients of Proposition 6.

Consider again the connection coefficients $c^*_n(m)$ of Proposition 6 and their representation (65)-(66). An
alternative representation can be given in terms of multivariate Hahn polynomials.

Corollary 4. Let c'^{m) be the connection coefficients between L"* and L^, as in Section^ Then 

\n\ 



dim) = „^i?M„(m) ^ -^Il-^,{r; \r\ 

|r|=0 1L=1 



(97) 



where $n' = (n_1, \ldots, n_{d-1})$,



l«l>nd \n\\ 



and $d_j$ is as in (66).
Proof. It is sufficient to use the explicit expression of the Lauricella function $F_A$ to see that



DMa{m) 



'\n\ , 



r|=0 lli=l 



(98) 



□ 



6 Multivariate Hahn and multiple Meixner polynomials. 

The Meixner polynomials on {0, 1, 2, ...}, defined by

$$M_n(k; \alpha, p) = {}_2F_1\!\left(-n,\ -k;\ \alpha;\ \frac{p-1}{p}\right), \qquad \alpha > 0,\ p \in (0, 1), \tag{99}$$



are orthogonal with respect to the Negative Binomial distribution $NB_{\alpha,p}$. The following representation of
the Meixner polynomials comes from the interpretation of $NB_{\alpha,p}$ as a Gamma mixture of Poisson likelihoods
(formula 

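Proposition 9. For $\alpha \in \mathbb{R}_+$ and $p \in (0, 1)$, a system of orthogonal polynomials with the Negative Binomial
$(\alpha, p)$ distribution as weight measure is given by

$$M^{\alpha,p}_n(k) = \int_0^\infty L^{\alpha}_n\!\left(\lambda\,\frac{1-p}{p}\right) \gamma_{\alpha+k,p}(d\lambda), \qquad n = 0, 1, \ldots, \tag{101}$$

where $L^\alpha_n$ are Laguerre polynomials with parameter $\alpha$.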
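Proof. For every $n$, consider that

$$\int_0^\infty \lambda^n \gamma_{\alpha+k,p}(d\lambda) = \int_0^\infty \frac{\lambda^{\alpha+k+n-1} e^{-\lambda/p}}{\Gamma(\alpha+k)\, p^{\alpha+k}}\, d\lambda = (\alpha+k)_{(n)}\, p^n.$$

So every polynomial in $\lambda$ of order $n$ is mapped to a polynomial in $k$ of the same order.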

To show orthogonality it is, again, sufficient to consider polynomials in the basis {^[k] : fc = 0, 1, . . .}. Let 

m < n. 

^ /"oo / -1 \ { ^ \a-\-k — l ^ 

k — v fc — J 

= (^^) |E^M^0A(fc)|7a,^(dA) 

L^:(^Ai-^)A"7a,^(dA) (102) 
where the last line comes from the fact that, if $K$ is a Poisson($\lambda$) random variable, then $E[K_{[m]}] = \lambda^m$.
Now, consider the change of measure induced by

1-p 



The last line of (102) reads



p 



A-p/ JO 

The integral vanishes for every m < n, and therefore the orthogonality is proved. □ 
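The statement of Proposition 9 can be checked numerically. The sketch below reads $\gamma_{a,p}$ as the Gamma law with shape $a$ and scale $p$ (consistent with the moment formula in the proof), takes the Negative Binomial weight as $\frac{\alpha_{(k)}}{k!} p^k (1-p)^\alpha$, and truncates the infinite sum over $k$; these readings, the truncation level and the function names are assumptions made for illustration.

```python
from math import exp, factorial, lgamma, log

def rising(a, k):
    """Rising factorial a_(k) = a (a+1) ... (a+k-1)."""
    out = 1.0
    for i in range(k):
        out *= a + i
    return out

def laguerre_coeffs(n, alpha):
    """Monomial coefficients of L_n^alpha(y) = alpha_(n)/n! * 1F1(-n; alpha; y)."""
    lead = rising(alpha, n) / factorial(n)
    return [lead * rising(-n, j) / (rising(alpha, j) * factorial(j))
            for j in range(n + 1)]

def meixner_mixture(n, k, alpha, p):
    """Integral of L_n^alpha(lam (1-p)/p) against Gamma(shape alpha+k, scale p).

    Uses E[lam^j] = (alpha+k)_(j) p^j, so each monomial contributes
    (alpha+k)_(j) (1-p)^j.
    """
    return sum(c * rising(alpha + k, j) * (1 - p) ** j
               for j, c in enumerate(laguerre_coeffs(n, alpha)))

def neg_binomial_pmf(k, alpha, p):
    """alpha_(k)/k! * p^k * (1-p)^alpha, computed on the log scale."""
    return exp(lgamma(alpha + k) - lgamma(alpha) - lgamma(k + 1)
               + k * log(p) + alpha * log(1 - p))

def inner_product(n, m, alpha, p, kmax=400):
    """Truncated sum over k of NB(k) M_n(k) M_m(k)."""
    return sum(neg_binomial_pmf(k, alpha, p)
               * meixner_mixture(n, k, alpha, p)
               * meixner_mixture(m, k, alpha, p)
               for k in range(kmax))

print(inner_product(1, 3, 2.5, 0.3))  # close to zero
print(inner_product(2, 2, 2.5, 0.3))  # strictly positive
```

For $n = 1$ the mixture reduces to $\alpha p - (1-p)k$, which is $\alpha p$ times the Meixner polynomial $M_1(k; \alpha, p)$ of (99), so the two families differ only by a normalising constant in that case.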

From Lemma[TJ by using Propositions[ni[5]and[ni and Remark[51 it is possible to find the following alternative 
systems of multivariate Meixner polynomials, orthogonal with respect to $NB_{\alpha,p}(r)$.

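Proposition 10. Let $\alpha \in \mathbb{R}^d_+$ and $p \in (0, 1)$.

(i) Two systems of multivariate orthogonal polynomials with weight measure $NB_{\alpha,p}(r)$ are:

$$M^{\alpha,p}_n(r) = \prod_{i=1}^d M^{\alpha_i,p}_{n_i}(r_i), \qquad n \in \mathbb{N}^d, \tag{103}$$

and

$${}^*M^{\alpha,p}_n(r) = (1-p)^{|n'|}\, M^{|\alpha|+2|n'|,p}_{n_d}(|r| - |n'|)\, (|\alpha + r|)_{(|n'|)}\, q^{\alpha}_{n'}(r; |r|), \qquad n \in \mathbb{N}^d, \tag{104}$$

where $n' = (n_1, \ldots, n_{d-1})$, $\{M^{\alpha,p}_n\}$ are Meixner polynomials as in Proposition 9 and $q^\alpha_{n'}$ are multivariate
Hahn polynomials defined by Proposition 8.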

(ii) A representation for these polynomials is: 



^ LI (a^) 7^+,,p(rfA), (106) 



*M-^(.) . /,.]^^r(Al^)<^(^A) (107) 
' Lr f aI^) 7^+,,p(rfA), (108) 

VP/ 

where $\{L_n\}$ and $\{L^*_n\}$ are given by (59) and (61), and

d 



i=l 



(iii) The connection coefficients between $\{M^{\alpha,p}_n\}$ and $\{{}^*M^{\alpha,p}_n\}$ are given by



E 

where c*„(n) are as in I165\) or IP' 
Proof. (UnSl) is trivial and (fTU5)) - piISl) follow from piin)) - piIT|) .

Now let us first prove (107)-(108). For every $z \in \mathbb{R}^d_+$, denote $x = z/|z|$. Consider that

$$\gamma_{\alpha,|\beta|}(dz) = \gamma_{|\alpha|,|\beta|}(d|z|)\, D_\alpha(dx)$$

and that

$$Po_z(r) = Po_{|z|}(|r|)\, B_x(r).$$
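The second factorisation used here, a product of independent Poisson laws written as a Poisson law for the total count times a multinomial law with probabilities $x = z/|z|$, is a standard fact and can be confirmed with a few lines of code. The function names below are hypothetical.

```python
from math import exp, factorial

def poisson_pmf(k, lam):
    """Poisson(lam) probability of the count k."""
    return exp(-lam) * lam ** k / factorial(k)

def product_poisson(r, z):
    """Joint pmf of independent Poisson(z_i) counts r_i."""
    out = 1.0
    for ri, zi in zip(r, z):
        out *= poisson_pmf(ri, zi)
    return out

def multinomial_pmf(r, x):
    """Multinomial pmf B_x(r) with |r| trials and cell probabilities x."""
    out = float(factorial(sum(r)))
    for ri, xi in zip(r, x):
        out *= xi ** ri / factorial(ri)
    return out

def factorised(r, z):
    """Po_{|z|}(|r|) * B_x(r) with x = z / |z|."""
    total = sum(z)
    x = [zi / total for zi in z]
    return poisson_pmf(sum(r), total) * multinomial_pmf(r, x)

r, z = (2, 0, 3), (1.2, 0.4, 2.5)
print(product_poisson(r, z), factorised(r, z))  # the two values agree up to rounding
```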



Combining this with Lemma [1] 



1-p 



I I ' 1-p 



(110) 



From Proposition 8, the last integral in (110) is equal to $q^\alpha_{n'}(r; |r|)$.
The first integral can be rewritten as 

\n'\ 



P 



|A| 



1-P 
P 



(l-p)l"'l(|a + r|)(|„,|) /lH+^I"'I |A| 



1-P 



e ^ 



= (l-p)l"l(|a + r|)(|„,|)..„^ 



'I) / I''"' p y r(|a + r + n'|)pl"+'-+" 

Ae+^l"'l(|r| - \n'\) 



TjdlM 



(111) 



The last line in (111) is obtained from (101) by rewriting $|n'| = 2|n'| - |n'|$ in the mixing measure. Thus the
identities (107)-(108) are proved.

To prove part (iii), simply use (63) with coefficients given by Proposition 6 to see that (105)-(106) and
(107)-(108) imply



\m\ — \n\ 



|m| = |n| 

^ C(n)M;^-P(r) 



A 



1-p 

p 

1-p 

p . 



I m I — I n I 



This is equivalent to (109) because of the orthogonality of $M^{\alpha,p}_m(r)$.

But (109) also implies that $\{{}^*M^{\alpha,p}_n(r)\}$ is an orthogonal system with $NB_{\alpha,p}$ as weight measure since, for
every polynomial $r_{[l]}$ of degree $|l| \le |n|$,



rSN'' |m| = |n| yrSN'' 

The term between brackets is non-zero only for $|l| = |m| = |n|$, which implies orthogonality, so the proof of
the proposition is now complete. □
6.1 The Bernstein-Bezier coefficients of the multiple Laguerre polynomials.

The representation of Meixner polynomials given in Proposition 10 leads, not surprisingly, to an interpretation of these
as the Bernstein-Bezier coefficients of the multiple Laguerre polynomials (for any choice of basis), up to
proportionality constants. Note that, for products of Poisson distributions we can write



|A|! 



-Bxir). 



(112) 



To simplify the notation, let $(L_n, M_n)$ denote either $(L^\alpha_n, M^{\alpha,p}_n)$ or $(L^{*\alpha}_n, {}^*M^{\alpha,p}_n)$, for some $\alpha \in \mathbb{R}^d_+$ and
$p \in (0, 1)$, and set $\rho_r(\alpha, p)^{-1} := E[M_r^2]$.



Lr A 



1-p 



Corollary 5. 

, ^ Pria,p)-rT77-yMr{m)Bxim). 

p y A ! ^ 

Proof. The proof is along the same lines as for Corollary 3. From (105)-(107),

1-p' 



E 



L„ Y 



P 



Mnim)NBl(m), 



Then from pT^ . 



Bxim) = \X\le\^\NBlp{m) ^M„(m)L„ [Y 



So for every r gN'^ 

Y,Mr{m)Bx{r 



lAllel^'E 

n 

|A|!el^l^i„(y 



Y,NB-i^p{m)Mn{m)Mr{r 

m 

i-p\ 1 



Y 



1-p 



P J Pr{a,p) 



|A|!el^l / 1-p 



Pr{a,p) 



and the proof is complete. 



(113) 



□ 